[runtime] Add page cache metrics #3544
fee5f1f to 850bfdf
Pull request overview
Adds observability for the paged page-cache by introducing a page_faults Prometheus counter and wiring metric registration through the runtime Metrics interface (via BufferPooler).
Changes:
- Add a `page_faults` counter to `CacheRef`, incremented when a cache miss enters the page-fault path.
- Change `CacheRef::new` to accept a `Metrics` context and register the new counter.
- Make `BufferPooler` extend `Metrics` so callers using `from_pooler` don't need to thread a separate metrics parameter.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| runtime/src/utils/buffer/paged/cache.rs | Registers and increments a new page_faults counter in the page-cache fault handler; updates constructor API and tests. |
| runtime/src/utils/buffer/paged/append.rs | Updates a test call site to the new CacheRef::new(&impl Metrics, ...) signature. |
| runtime/src/lib.rs | Updates BufferPooler to be a supertrait of Metrics to support metric registration from pooler-based construction. |
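The shape of this change can be sketched with standard-library types only. Everything below is a hypothetical stand-in, not the actual commonware API: `Counter`, `Registry`, the one-method `Metrics` trait, the `alloc` method on `BufferPooler`, and `PageCache` are all invented for illustration. The point is only that making `Metrics` a supertrait lets a pooler-based constructor register the counter without an extra argument.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a Prometheus counter.
#[derive(Clone, Default)]
struct Counter(Arc<AtomicU64>);

impl Counter {
    fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical, simplified metrics context: registers a counter by name.
trait Metrics {
    fn register(&mut self, name: &str, counter: Counter);
}

/// Hypothetical pooler. `Metrics` is a supertrait, so any pooler can also
/// register metrics.
trait BufferPooler: Metrics {
    fn alloc(&self, len: usize) -> Vec<u8>;
}

#[derive(Default)]
struct Registry {
    counters: HashMap<String, Counter>,
}

impl Metrics for Registry {
    fn register(&mut self, name: &str, counter: Counter) {
        self.counters.insert(name.to_string(), counter);
    }
}

impl BufferPooler for Registry {
    fn alloc(&self, len: usize) -> Vec<u8> {
        vec![0; len]
    }
}

struct PageCache {
    pages: HashMap<u64, Vec<u8>>,
    page_size: usize,
    page_faults: Counter,
}

impl PageCache {
    /// No separate metrics parameter: the pooler doubles as the metrics context.
    fn from_pooler(pooler: &mut impl BufferPooler, page_size: usize) -> Self {
        let page_faults = Counter::default();
        pooler.register("page_faults", page_faults.clone());
        Self { pages: HashMap::new(), page_size, page_faults }
    }

    /// Returns the page, counting a fault and loading it on a miss.
    fn read(&mut self, pooler: &impl BufferPooler, page_num: u64) -> &[u8] {
        if !self.pages.contains_key(&page_num) {
            self.page_faults.inc();
            self.pages.insert(page_num, pooler.alloc(self.page_size));
        }
        &self.pages[&page_num]
    }
}

fn demo_faults() -> u64 {
    let mut registry = Registry::default();
    let mut cache = PageCache::from_pooler(&mut registry, 4096);
    cache.read(&registry, 0); // miss
    cache.read(&registry, 0); // hit
    cache.read(&registry, 1); // miss
    registry.counters["page_faults"].get()
}

fn main() {
    println!("page_faults = {}", demo_faults());
}
```

In this sketch the registry and the cache share the counter through an `Arc`, so the value read back from the registry reflects increments made inside the cache.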
850bfdf to 80e0c86
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Page fault counter incremented before confirming actual fault
- Moved page_faults.inc() to after the write-lock cache recheck so the counter only increments for confirmed page faults, not when another task cached the page concurrently.
Or push these changes by commenting:
@cursor push f4af0d3b46
Preview (f4af0d3b46)
diff --git a/runtime/src/utils/buffer/paged/cache.rs b/runtime/src/utils/buffer/paged/cache.rs
--- a/runtime/src/utils/buffer/paged/cache.rs
+++ b/runtime/src/utils/buffer/paged/cache.rs
@@ -309,8 +309,6 @@
let (page_num, offset_in_page) = Cache::offset_to_page(self.page_size, offset);
let offset_in_page = offset_in_page as usize;
- self.page_faults.inc();
- trace!(page_num, blob_id, "page fault");
// Create or clone a future that retrieves the desired page from the underlying blob. This
// requires a write lock on the page cache since we may need to modify `page_fetches` if
@@ -325,6 +323,10 @@
return Ok(count);
}
+ // Only count as a page fault after confirming the page is not in cache.
+ self.page_faults.inc();
+ trace!(page_num, blob_id, "page fault");
+
let key = (blob_id, page_num);
match cache.page_fetches.entry(key) {
Entry::Occupied(o) => {
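The ordering in this autofix (recheck under the write lock, then count) can be illustrated with a small synchronous sketch. It assumes a plain `std::sync::RwLock` and fetches the page while holding the write lock, which is a simplification: the real code shares a fetch future through `page_fetches` rather than blocking. The `Cache` type and `fetch` closure here are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

struct Cache {
    pages: RwLock<HashMap<u64, Vec<u8>>>,
    page_faults: AtomicU64,
}

impl Cache {
    fn new() -> Self {
        Self {
            pages: RwLock::new(HashMap::new()),
            page_faults: AtomicU64::new(0),
        }
    }

    /// `fetch` stands in for reading the page from the underlying blob.
    fn read(&self, page_num: u64, fetch: impl FnOnce() -> Vec<u8>) -> Vec<u8> {
        // Fast path: look the page up under the read lock.
        {
            let pages = self.pages.read().unwrap();
            if let Some(page) = pages.get(&page_num) {
                return page.clone();
            }
        }

        // Slow path: another task may have cached the page between the two
        // locks, so recheck before treating this as a fault.
        let mut pages = self.pages.write().unwrap();
        if let Some(page) = pages.get(&page_num) {
            return page.clone();
        }

        // Only now is the miss confirmed, so only now is it counted.
        self.page_faults.fetch_add(1, Ordering::Relaxed);
        let page = fetch();
        pages.insert(page_num, page.clone());
        page
    }
}

/// Several threads race to read the same page.
fn concurrent_faults(threads: usize) -> u64 {
    let cache = Cache::new();
    std::thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                cache.read(7, || vec![0u8; 16]);
            });
        }
    });
    cache.page_faults.load(Ordering::Relaxed)
}

fn main() {
    println!("page_faults = {}", concurrent_faults(8));
}
```

Had the increment stayed before the recheck, every thread that missed on the read lock would have been counted, even though only one of them actually loads the page.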
80e0c86 to 2269b87
0ff6666 to 2c1f46f
let init_cache =
    CacheRef::from_pooler(context.with_label("init"), PAGE_SIZE, PAGE_CACHE_SIZE);
let mut mmr = Mmr::<_, Digest>::init(
    context.with_label("init"),
Shouldn't these use different names?
// init_sync with range starting beyond the existing data triggers the
// "fresh start" path (clear_to_size).
let sync_cache =
    CacheRef::from_pooler(context.with_label("sync"), PAGE_SIZE, PAGE_CACHE_SIZE);
Same name as the MMR below.
let mut replay_blob =
    ReadBuffer::from_pooler(&context, blob.clone(), *size, config.replay_buffer);
let mut replay_blob = ReadBuffer::from_pooler(
    context.with_label("replay"),
Could this lead to a conflicting context?
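To make the concern in these three comments concrete, here is a minimal sketch of how two components built from the same label can collide on a metric name. It is an illustration under stated assumptions, not the runtime's behavior: the `Context` type, the underscore-joined prefix, and a `register` that returns an error on duplicates are all hypothetical.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;

/// Hypothetical context: a label prefix plus a shared set of registered names.
#[derive(Clone)]
struct Context {
    prefix: String,
    registered: Rc<RefCell<HashSet<String>>>,
}

/// Joins a prefix and a name with an underscore (assumed convention).
fn join(prefix: &str, name: &str) -> String {
    if prefix.is_empty() {
        name.to_string()
    } else {
        format!("{prefix}_{name}")
    }
}

impl Context {
    fn root() -> Self {
        Self { prefix: String::new(), registered: Rc::default() }
    }

    fn with_label(&self, label: &str) -> Self {
        Self {
            prefix: join(&self.prefix, label),
            registered: self.registered.clone(),
        }
    }

    /// Registers a metric name; fails if the fully qualified name is taken.
    fn register(&self, name: &str) -> Result<String, String> {
        let full = join(&self.prefix, name);
        if self.registered.borrow_mut().insert(full.clone()) {
            Ok(full)
        } else {
            Err(format!("duplicate metric: {full}"))
        }
    }
}

fn demo() -> (bool, bool, bool) {
    let context = Context::root();
    // Cache and MMR both derive their context from the "init" label.
    let cache = context.with_label("init").register("page_faults");
    let mmr = context.with_label("init").register("page_faults");
    // A distinct label avoids the clash.
    let distinct = context.with_label("init_cache").register("page_faults");
    (cache.is_ok(), mmr.is_err(), distinct.is_ok())
}

fn main() {
    println!("{:?}", demo());
}
```

Whether a real clash surfaces as an error, a panic, or silently merged series depends on the runtime; the sketch only shows why reusing a label for two metric-registering components is worth flagging.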
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Double "network" label in setup_network_with_participants calls
- Changed all 16 call sites from context.with_label("network") to context.clone() since setup_network_with_participants already applies with_label("network") internally.
Or push these changes by commenting:
@cursor push 074a42aec8
Preview (074a42aec8)
diff --git a/consensus/src/marshal/mocks/harness.rs b/consensus/src/marshal/mocks/harness.rs
--- a/consensus/src/marshal/mocks/harness.rs
+++ b/consensus/src/marshal/mocks/harness.rs
@@ -1994,7 +1994,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -2178,7 +2178,7 @@
let attacker = participants[1].clone();
let peers = vec![victim.clone(), attacker.clone()];
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
peers.clone(),
)
@@ -2297,7 +2297,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -2374,7 +2374,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -2470,7 +2470,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -2557,7 +2557,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -2767,7 +2767,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -2854,7 +2854,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -2932,7 +2932,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -3027,7 +3027,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -3092,7 +3092,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -3180,7 +3180,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(3),
participants.clone(),
)
@@ -3293,7 +3293,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -3365,7 +3365,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -3498,7 +3498,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
@@ -3587,7 +3587,7 @@
..
} = bls12381_threshold_vrf::fixture::<V, _>(&mut context, NAMESPACE, NUM_VALIDATORS);
let mut oracle = setup_network_with_participants(
- context.with_label("network"),
+ context.clone(),
NZUsize!(1),
participants.clone(),
)
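The double-label issue fixed above can be reproduced in a few lines. This sketch assumes nested labels are joined with an underscore; `Context` and `setup_network` are hypothetical stand-ins for the runtime context and `setup_network_with_participants`.

```rust
/// Hypothetical context that only tracks its accumulated label.
#[derive(Clone, Default)]
struct Context {
    label: String,
}

impl Context {
    /// Appends a label, joining with an underscore (assumed convention).
    fn with_label(&self, label: &str) -> Self {
        let label = if self.label.is_empty() {
            label.to_string()
        } else {
            format!("{}_{}", self.label, label)
        };
        Self { label }
    }
}

/// Stand-in for a helper that already applies the "network" label internally.
fn setup_network(context: Context) -> String {
    context.with_label("network").label
}

fn main() {
    let context = Context::default();

    // Before the fix: the caller labels, then the helper labels again.
    println!("{}", setup_network(context.with_label("network")));

    // After the fix: the caller passes the context through unchanged.
    println!("{}", setup_network(context.clone()));
}
```

Under that assumption the first call yields `network_network` and the second yields `network`, which is why the call sites switch to `context.clone()`.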
# Conflicts:
#	storage/fuzz/fuzz_targets/fixed_journal_operations.rs
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit ed497e8.
};
let (engine, engine_mailbox) =
    Engine::<_, PublicKey, TestMessage, _>::new(context.clone(), config);
    Engine::<_, PublicKey, TestMessage, _>::new(context.child("engine"), config);
Just take the context from above as-is?
};
let (engine, mailbox) =
    Engine::<_, PublicKey, TestMessage, _>::new(engine_context.clone(), config);
    Engine::<_, PublicKey, TestMessage, _>::new(engine_context.child("engine"), config);
Just take the context as-is?
mailboxes.insert(peer_b.clone(), mailbox_b);
for (peer, network) in registrations {
    let ctx = context.with_label(&format!("peer_{}", peer));
    let ctx = context.child("peer").with_attribute("peer", &peer);
nit: .child(engine)?
# Conflicts:
#	consensus/src/marshal/coding/marshaled.rs
#	consensus/src/marshal/standard/deferred.rs
#	consensus/src/marshal/standard/inline.rs
#	storage/src/qmdb/benches/merkleize.rs
# Conflicts:
#	runtime/src/utils/mod.rs
Codecov Report
❌ Patch coverage is
@@ Coverage Diff @@
## main #3544 +/- ##
==========================================
+ Coverage 95.87% 95.92% +0.05%
==========================================
Files 440 440
Lines 172054 174371 +2317
Branches 4001 3995 -6
==========================================
+ Hits 164958 167270 +2312
- Misses 5827 5831 +4
- Partials 1269 1270 +1
... and 2 files with indirect coverage changes
- Adds a `page_faults` metric to `CacheRef` (page cache), incremented on every cache miss that enters the fault handler.
- Adds a `page_evictions` metric to `CacheRef`, incremented when an old page has to be evicted to make room for a new one.
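How the two counters relate can be sketched with a tiny fixed-capacity cache. This is illustrative only: it uses FIFO replacement and plain `u64` fields instead of the cache's actual eviction policy and registered counters, and the `PageCache` type here is hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical fixed-capacity page cache with FIFO replacement.
struct PageCache {
    capacity: usize,
    pages: HashMap<u64, Vec<u8>>,
    order: VecDeque<u64>,
    page_faults: u64,
    page_evictions: u64,
}

impl PageCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            pages: HashMap::new(),
            order: VecDeque::new(),
            page_faults: 0,
            page_evictions: 0,
        }
    }

    fn read(&mut self, page_num: u64) -> &[u8] {
        if !self.pages.contains_key(&page_num) {
            // Every miss is a fault...
            self.page_faults += 1;
            // ...but only a miss on a full cache forces an eviction.
            if self.pages.len() == self.capacity {
                if let Some(old) = self.order.pop_front() {
                    self.pages.remove(&old);
                    self.page_evictions += 1;
                }
            }
            self.pages.insert(page_num, vec![0u8; 8]);
            self.order.push_back(page_num);
        }
        &self.pages[&page_num]
    }
}

fn demo() -> (u64, u64) {
    let mut cache = PageCache::new(2);
    for page_num in [0, 1, 0, 2, 0] {
        cache.read(page_num);
    }
    (cache.page_faults, cache.page_evictions)
}

fn main() {
    let (faults, evictions) = demo();
    println!("page_faults = {faults}, page_evictions = {evictions}");
}
```

With a capacity of two pages, the access pattern 0, 1, 0, 2, 0 produces four faults and two evictions in this sketch: the first two faults fill the cache, and each later fault displaces the oldest page.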